30 research outputs found

    TEXT CATEGORIZATION USING ONLY FRAGMENTS OF DOCUMENTS

    In this paper we present a series of experiments that examine how particular parts of a document contribute to the performance of a classifier. We evaluated text classifiers on two very different text corpora. We conclude that some parts of the text matter more than others for classification performance, and that giving higher weights to the more important parts can improve the classifier. Which parts are more or less important depends on the nature of the documents in the corpus. Some tasks remain to be done:
    − More text corpora should be investigated.
    − In Section 6.4 we optimized the number of features to keep independently of the section; it could instead be optimized for each section.
    − Documents could be split into parts of 50 words, to examine what happens when the parts are of equal size not only within a document but also across documents.
    − When splitting documents into k equal parts, the classifiers obtained for different k values could be combined.
    Keywords: machine learning, text categorization, classifier ensembles, Research and Development/Tech Change/Emerging Technologies
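As a rough illustration of the part-weighting idea in this abstract, the sketch below splits a document into k contiguous parts and builds term counts in which each part is scaled by its own weight. The function names, the whitespace tokenizer, and the weight values are assumptions made for the example; they are not taken from the paper.

```python
from collections import Counter


def split_into_parts(tokens, k):
    """Split a token list into k contiguous parts of near-equal size."""
    n = len(tokens)
    bounds = [round(i * n / k) for i in range(k + 1)]
    return [tokens[bounds[i]:bounds[i + 1]] for i in range(k)]


def weighted_term_counts(text, weights):
    """Bag-of-words counts where each part's terms are scaled by its weight.

    len(weights) determines the number of parts (hypothetical choice).
    """
    tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
    parts = split_into_parts(tokens, len(weights))
    counts = Counter()
    for part, weight in zip(parts, weights):
        for token in part:
            counts[token] += weight
    return counts


# Weight the first half of the document twice as heavily as the second half.
features = weighted_term_counts("alpha beta gamma delta", [2.0, 1.0])
print(features)
```

The resulting weighted counts could be fed to any bag-of-words classifier; which weights help depends on the corpus, as the abstract notes.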

    Tadeusz Dobrowiecki - Measurement technology applications, biomedical informatics

    CONSTRAINTS: A PROGRAMMING PARADIGM AND A MODELLING METHODOLOGY

    Constraints are often used as a formal approach to problems because they capture the very essence of a problem. Many problems can be viewed as a set of variables and a set of relations on them. From this point of view, a problem maps naturally to a constraint network (the nodes of the network represent the variables, and the constraints in the network represent the relations between the variables of the problem), which gives great significance to research on constraints. An additional advantage is that constraint networks achieve global consistency through local computations. Constraints and the Constraint Satisfaction Problem (CSP) can be classified by various criteria. The most significant classification is based on the type of the values assigned to the nodes. Another possible classification of CSPs is based on the kind of solution required. Significant effort has been invested in developing general constraint programming languages (CPL) that provide an environment where the user only has to declare what is wanted, without specifying how it is done. Although these languages aimed at generality, they could not fully achieve their goal because of their limited support for data abstraction and higher-order constraints. When the main stress is on efficiency, dedicated solutions claim their place with their own data structures and specialised constraint satisfaction algorithms. The main goal of this paper is to give an overview of constraints as a flexible knowledge representation tool, and to draw attention to the problems of representation and to methods for finding the solutions of the different types of constraint networks.
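To make the mapping from a problem to a constraint network concrete, here is a minimal sketch: variables are nodes, each with a finite domain, and binary constraints are predicates attached to pairs of variables. The solver is plain backtracking search, chosen for brevity; it is not a method attributed to the paper, and the names `solve`, `domains`, and `constraints` are illustrative assumptions.

```python
def solve(variables, domains, constraints, assignment=None):
    """Backtracking search over a binary constraint network.

    constraints maps a pair (x, y) to a predicate over (value_of_x, value_of_y).
    Returns one satisfying assignment as a dict, or None if there is none.
    """
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(variables):
        return dict(assignment)
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        consistent = True
        for (x, y), pred in constraints.items():
            # Check only constraints linking var to an already assigned node.
            if x == var and y in assignment and not pred(value, assignment[y]):
                consistent = False
            elif y == var and x in assignment and not pred(assignment[x], value):
                consistent = False
        if consistent:
            assignment[var] = value
            result = solve(variables, domains, constraints, assignment)
            if result is not None:
                return result
            del assignment[var]  # undo and try the next value
    return None


# Tiny colouring problem: neighbouring nodes must take different values.
variables = ["A", "B", "C"]
domains = {v: ["red", "green"] for v in variables}
constraints = {
    ("A", "B"): lambda a, b: a != b,
    ("B", "C"): lambda b, c: b != c,
}
print(solve(variables, domains, constraints))
```

Each consistency check touches only the constraints adjacent to the variable being assigned, which is a small instance of the local computation the abstract refers to.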

    Adapting IT Algorithms and Protocols to an Intelligent Urban Traffic Control

    Autonomous vehicles that communicate with each other and with the urban infrastructure open the opportunity to introduce new, complex and effective behaviours into intelligent traffic systems. Such systems can be perceived quite naturally as hierarchically built intelligent multi-agent systems, with decision making based on well-defined and thoroughly tested mathematical algorithms borrowed, for example, from the field of information technology. In this article, two examples of adapting such algorithms to intelligent urban traffic are presented. Since optimal and fair timing of the traffic lights is crucial in traffic control, we show how a simple Round-Robin scheduler and Minimal Destination Distance First scheduling (an adaptation of the theoretically optimal Shortest Job First scheduler) were implemented and tested for traffic light control. The other example is the mitigation of congested traffic using an analogy with the Explicit Congestion Notification (ECN) protocol of computer networks. We show that traffic light control based on optimal scheduling can handle roughly the same traffic complexity as traditional light programs in the nominal case. However, in extraordinary and especially rapidly evolving situations, the intelligent solutions can clearly outperform the traditional ones. The ECN-based method can successfully limit the traffic flowing through bounded areas. In this way the number of vehicles passing through, for example, residential areas may be reduced, making them more comfortable, congestion-free zones in a city.
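The two scheduling policies named in this abstract can be contrasted with a toy model. The sketch below assumes one queue of vehicles per approach to a junction, one vehicle released per green step, and each vehicle represented only by its remaining distance to its destination; all of these modelling choices, and the function names, are assumptions for illustration rather than the authors' implementation.

```python
from collections import deque
from itertools import cycle


def round_robin(queues, steps):
    """Give green to each approach in a fixed cyclic order, one vehicle per step."""
    order = cycle(sorted(queues))
    served = []
    for _ in range(steps):
        approach = next(order)
        if queues[approach]:  # an empty approach wastes its turn
            served.append((approach, queues[approach].popleft()))
    return served


def min_distance_first(queues, steps):
    """Give green to the approach whose front vehicle is closest to its destination.

    This mirrors Shortest Job First, with remaining distance as the job length.
    """
    served = []
    for _ in range(steps):
        waiting = [a for a in queues if queues[a]]
        if not waiting:
            break
        approach = min(waiting, key=lambda a: (queues[a][0], a))
        served.append((approach, queues[approach].popleft()))
    return served


def demo_queues():
    # Remaining distances in km for the vehicles waiting on each approach.
    return {"north": deque([5.0, 1.0]), "east": deque([2.0])}


print(round_robin(demo_queues(), 3))
print(min_distance_first(demo_queues(), 3))
```

In this toy run the round-robin policy spends its third step on the already empty east approach, while the distance-based policy keeps serving whichever approach still has vehicles waiting.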

    Rendszermodellezés mérési adatokból, hibrid-neurális megközelítés = System modelling from measurement data: hybrid-neural approach

    The goal of the research was to develop and analyse procedures for system modelling from measurement data, especially for modelling non-linear systems. Different approaches were applied to reach this goal. One approach uses procedures developed for linear system modelling, with nonlinear effects taken into account. The other approach is black-box modelling, where model construction is based mainly on input-output data. The first approach proved successful especially for modelling weakly non-linear systems, which are treated as linear systems with nonlinear distortion. A complete theory was developed to understand the effects of nonlinear distortions. Black-box modelling started from general model structures whose parameters are determined by training on the available measurement data. The most relevant questions in this case concern the construction of the initial database and the quality of the available data (noisy data, missing data, outliers, inconsistent data, redundant data, etc.). A further important goal was to find ways to keep model complexity under control while making effective use of knowledge available beyond the data. For black-box modelling, some special neural network architectures and support vector machines were considered, with the aim of keeping model complexity as low as possible.

    OPTIMUM PROBLEMS IN CONSTRAINT SATISFACTION

    With the spread of Expert Systems (ES), more and more emphasis has been put on efficient knowledge representation. Constraints, a fast-developing field in Artificial Intelligence (AI), try to cope with the new requirements imposed on knowledge representation tools in practical applications. Efficient algorithms operating on constraint networks have been developed [4, 7, 10], and recently some constraint systems have been constructed [3] that are capable of solving relatively complicated problems stated in a declarative way. However, these systems mainly emphasize finding a feasible solution; they do not rank the different solutions. In some cases this will not suffice: when an optimality criterion can be meaningfully defined, an optimal solution can be important. The present paper gives a brief overview of constraints, emphasizes the importance of searching for an optimal solution, and gives an application area where the optimal solution would greatly improve the performance of the embedding system.
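The difference between finding any feasible solution and finding an optimal one can be sketched with a small depth-first branch and bound. This is a generic technique used here for illustration, not the algorithm of the paper. The sketch assumes an additive, non-negative cost per assigned value (which is what makes the pruning step valid) and a user-supplied `feasible` predicate over partial assignments; both names are hypothetical.

```python
def best_solution(variables, domains, feasible, cost):
    """Depth-first branch and bound over a finite-domain constraint problem.

    feasible(assignment) -> bool checks a (possibly partial) assignment.
    cost(var, value) -> non-negative number, summed over the assignment.
    Returns (assignment, total_cost), or (None, inf) if nothing is feasible.
    """
    best = {"cost": float("inf"), "assignment": None}

    def extend(index, assignment, running):
        if running >= best["cost"]:
            return  # prune: this branch cannot beat the best solution so far
        if index == len(variables):
            best["cost"], best["assignment"] = running, dict(assignment)
            return
        var = variables[index]
        for value in domains[var]:
            assignment[var] = value
            if feasible(assignment):
                extend(index + 1, assignment, running + cost(var, value))
            del assignment[var]

    extend(0, {}, 0)
    return best["assignment"], best["cost"]


# Constraint: x < y. Objective: minimise x + y.
variables = ["x", "y"]
domains = {"x": [1, 2, 3], "y": [1, 2, 3]}


def feasible(a):
    return not ("x" in a and "y" in a) or a["x"] < a["y"]


print(best_solution(variables, domains, feasible, lambda var, value: value))
# -> ({'x': 1, 'y': 2}, 3)
```

A feasibility-only solver would stop at the first assignment with x < y; here the search continues, but branches whose partial cost already reaches the best known total are cut off.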